ADF Data Description Developer's Guide v1.5.3 RF

The Allotrope Data Format (ADF) [[!ADF]] consists of several APIs and taxonomies. This document constitutes the Developer's Guide for the ADF Data Description API (ADF-DD) [[!ADF-DD]]. It provides examples of how to use the ADF-DD API to store metadata of the Data Package and the Data Cubes along with experimental or process data and contextual metadata. ADF-DD is based on semantic web standards and linked data concepts using the RDF Data Model.

Disclaimer

THESE MATERIALS ARE PROVIDED "AS IS" AND ALLOTROPE EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE WARRANTIES OF NON-INFRINGEMENT, TITLE, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

Prefix	Namespace
owl:	http://www.w3.org/2002/07/owl#
rdf:	http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs:	http://www.w3.org/2000/01/rdf-schema#
xsd:	http://www.w3.org/2001/XMLSchema#
skos:	http://www.w3.org/2004/02/skos/core#
dct:	http://purl.org/dc/terms/
foaf:	http://xmlns.com/foaf/0.1/
adf-dp:	http://purl.allotrope.org/ontologies/datapackage#
adf-dc:	http://purl.allotrope.org/ontologies/datacube#
af-r:	http://purl.allotrope.org/ontologies/result#
af-x:	http://purl.allotrope.org/ontologies/property#

RDF Graph Operations

The Data Description API provides functions for the core operations on the RDF graph - i.e. the data stored in the ADF Triple Store.

The API of Apache Jena [[APACHE-JENA]] constitutes the ADF Data Description API. In the following sections, the API is introduced and illustrated by examples. For a complete description of the API, please consult the JavaDoc API documentation and the sources listed in the References section.

The RDF Graph

In Apache Jena, an RDF graph [[!rdf11-concepts]] is called a model and is represented by the Model interface [[APACHE-JENA]], [[JENA-INTRO]]. The following sections illustrate how the Jena Model can be used for operations on the RDF graph. For more details on the Jena API, please refer to [[APACHE-JENA]] and [[JENA-INTRO]].

Querying an RDF Graph

The RDF graph can be queried by iterating over the Jena Model or by running SPARQL queries on the Model. The following sections illustrate both approaches.

Iterating over the RDF Graph

The Model interface of Apache Jena contains a listStatements() method that returns an iterator (a StmtIterator, which is a subtype of Java's Iterator) over all statements in the model. The Statement interface provides accessor methods for the subject, predicate and object of a statement.

The following example illustrates iterating over the model:

Java:
// Iterate over all RDF Statements in the DataDescription.
// StmtIterator is an Iterator, not an Iterable, so an explicit loop is used.
StmtIterator iter = dataDescription.listStatements();
while (iter.hasNext()) {
	Statement stmt = iter.nextStatement();

	// get the subject
	Resource subject = stmt.getSubject();

	// get the predicate
	Property predicate = stmt.getPredicate();

	// get the object
	RDFNode object = stmt.getObject();
}

C#:
// Iterate over all RDF Statements in the DataDescription
StmtIterator iter = dataDescription.listStatements();
while (iter.MoveNext())
{
	Statement stmt = iter.Current;

	// get the subject
	Resource subject = stmt.getSubject();

	// get the predicate
	Property predicate = stmt.getPredicate();

	// get the object
	RDFNode @object = stmt.getObject();
}
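
Iteration is often combined with a filter on one of the three statement positions. The following is a minimal, self-contained sketch in plain Java. The Triple class, the subjectsWithPredicate method and the ex: identifiers are hypothetical stand-ins for illustration and are not part of the Jena or ADF API. With a Jena Model, the same check would be applied inside the loops shown above, using the accessor methods of Statement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an RDF statement; NOT the Jena Statement interface.
final class Triple {
	final String subject;
	final String predicate;
	final String object;

	Triple(String subject, String predicate, String object) {
		this.subject = subject;
		this.predicate = predicate;
		this.object = object;
	}
}

final class StatementFilterSketch {

	// Collect the subjects of all statements that use the given predicate.
	static List<String> subjectsWithPredicate(Iterable<Triple> statements, String predicate) {
		List<String> subjects = new ArrayList<>();
		for (Triple stmt : statements) {
			if (stmt.predicate.equals(predicate)) {
				subjects.add(stmt.subject);
			}
		}
		return subjects;
	}

	public static void main(String[] args) {
		// Illustrative data only; the ex: names are assumptions.
		List<Triple> statements = List.of(
			new Triple("ex:balance1", "dct:title", "My analytical balance"),
			new Triple("ex:balance1", "rdf:type", "ex:Balance"),
			new Triple("ex:cube1", "dct:title", "Intensity cube"));
		System.out.println(subjectsWithPredicate(statements, "dct:title"));
	}
}
```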

Querying the RDF Graph via SPARQL

This section describes how to query an RDF graph via a SPARQL query.

Example Query

The following figure shows a total ion chromatogram and the spectra that belong to it. Assume that we want to query the data highlighted in the figure:

Representation of the Query in SPARQL

To keep the example simple, it uses human-readable identifiers surrounded by guillemets (« ») instead of artificial identifiers.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX qudt: <http://qudt.org/schema/qudt#>
PREFIX af-r: <http://purl.allotrope.org/ontologies/result#>
PREFIX af-x: <http://purl.allotrope.org/ontologies/property#>
SELECT ?scan ?scanTimeValue ?scanTimeUnit ?totalIonCurrent ?basePeakPosValue ?basePeakPosUnit ?basePeakHeightValue ?basePeakHeightUnit
WHERE {
	?spectrum a «af-r:MS1 spectrum» ;
	«af-x:total ion current» ?totalIonCurrent ;
	«af-x:base peak» ?basePeak .
	?basePeak a «af-r:peak» ;
	«af-x:total retention time» [
		qudt:numericValue ?basePeakPosValue ;
		qudt:unit ?basePeakPosUnit ;
	] ;
	«af-x:intensity» [
		qudt:numericValue ?basePeakHeightValue ;
		qudt:unit ?basePeakHeightUnit ;
	] .
	?scan a «af-x:scan» ;
	«af-x:scanned spectrum» ?spectrum ;
	«af-x:start time» [
		qudt:numericValue ?scanTimeValue ;
		qudt:unit ?scanTimeUnit ;
	] .
}

A short query with artificial identifiers could look like this:

PREFIX qudt: <http://qudt.org/schema/qudt#>
PREFIX af-r: <http://purl.allotrope.org/ontologies/result#>
PREFIX af-x: <http://purl.allotrope.org/ontologies/property#>
SELECT ?spectrum ?basePeakHeightValue ?basePeakHeightUnit
WHERE {
	?spectrum a af-r:AFR_0000472 ;
	af-x:AFX_0000361 [
		qudt:numericValue ?basePeakHeightValue ;
		qudt:unit ?basePeakHeightUnit ;
	] .
}
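
The execution examples in this guide expect the query as a Java String named queryString. The following is a minimal sketch of assembling the short query above in plain Java. The class and method names are hypothetical, and the af-x namespace is taken from the namespace table of this document.

```java
// Minimal sketch: assemble the short query as a Java String.
// ShortQuerySketch and buildQuery are hypothetical names, not ADF or Jena API.
final class ShortQuerySketch {

	static String buildQuery() {
		return String.join("\n",
			"PREFIX qudt: <http://qudt.org/schema/qudt#>",
			"PREFIX af-r: <http://purl.allotrope.org/ontologies/result#>",
			"PREFIX af-x: <http://purl.allotrope.org/ontologies/property#>",
			"SELECT ?spectrum ?basePeakHeightValue ?basePeakHeightUnit",
			"WHERE {",
			"  ?spectrum a af-r:AFR_0000472 ;",
			"    af-x:AFX_0000361 [",
			"      qudt:numericValue ?basePeakHeightValue ;",
			"      qudt:unit ?basePeakHeightUnit",
			"    ] .",
			"}");
	}

	public static void main(String[] args) {
		// The resulting string can be passed to QueryFactory.create(...).
		System.out.println(buildQuery());
	}
}
```

Keeping each line of the query as a separate string literal makes it easy to compare the Java source with the SPARQL text.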

Execution of the SPARQL Query

Given a SPARQL query as queryString and a Jena Model, we can execute the query on the model as follows, using the Jena ARQ API [[JENA-ARQ]]:

JAVA:
Query query = QueryFactory.create(queryString);
try (QueryExecution qexec = QueryExecutionFactory.create(query, dataDescription)) {
	ResultSet resultSet = qexec.execSelect();
	while (resultSet.hasNext()) {
		QuerySolution solution = resultSet.nextSolution();
		// extract the queried information via solution.get("variable name")
		// and do something with the result
	}
}

C#:
Query query = QueryFactory.create(queryString);
QueryExecution qexec = QueryExecutionFactory.create(query, dataDescription);
try
{
	ResultSet results = qexec.execSelect();
	while (results.MoveNext())
	{
		QuerySolution solution = results.Current;
		// extract the queried information via solution.getResource("variable name");
		// and do something with the result
	}
}
finally
{
	qexec.close();
}

In this example, we first create a Query out of the given query string. Then we execute it and finally iterate over the results.

Returning to the example query above, the results can be processed like this:

JAVA:
// Iterate over the ResultSet
while (resultSet.hasNext()) {
	QuerySolution soln = resultSet.nextSolution();

	// Get the Scan Time, the Total Ion Current and the Base Peak of the spectrum
	String scanTime = soln.get("scanTimeValue") + " " + soln.get("scanTimeUnit");
	String tic = String.valueOf(soln.get("totalIonCurrent"));
	String basePeakMZ = soln.get("basePeakPosValue") + " " + soln.get("basePeakPosUnit");
	String basePeakint = soln.get("basePeakHeightValue") + " " + soln.get("basePeakHeightUnit");
}

C#:
// Iterate over the ResultSet
while (results.MoveNext())
{
	com.hp.hpl.jena.query.QuerySolution soln = results.Current;

	// Get the Scan Time, the Total Ion Current and the Base Peak of the spectrum
	String scanTime = soln.get("scanTimeValue") + " " + soln.get("scanTimeUnit");
	String tic = soln.get("totalIonCurrent").ToString();
	String basePeakMZ = soln.get("basePeakPosValue") + " " + soln.get("basePeakPosUnit");
	String basePeakint = soln.get("basePeakHeightValue") + " " + soln.get("basePeakHeightUnit");
}

Please refer to [[JENA-ARQ]] for more details and additional examples on SPARQL queries with the Jena ARQ API.

Modifying Statements in an RDF Graph

Inserting Statements into an RDF Graph

The Jena Model that represents the RDF graph consists of a set of statements. A statement is a triple that consists of subject, predicate and object. Statements can be constructed and inserted into the Jena Model via createResource() and addProperty() in fluent API style:

JAVA and C#:
Resource myBalance = dataDescription.createResource(<myBalanceURI>)
	.addProperty(DCTerms.title, "My analytical balance that ...");

In this example, we added the triple

<myBalanceURI> dct:title 'My analytical balance that ...'

to the model, using the Dublin Core Terms 'title' predicate. The namespace declaration for Dublin Core Terms (dct) can be found in the Namespaces section.

Updating a Statement in the RDF Graph

Given a data cube and its representation in an RDF graph, assume we have a typo in the label of the dimension representing the intensity. The following SPARQL 1.1 Update [[!sparql11-update]], [[SEARBORNE-SPARQL]] query illustrates how this typo can be fixed:

PREFIX adf-dc: <http://purl.allotrope.org/ontologies/datacube#>
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

DELETE { ?dimension rdfs:label 'intensty' }
INSERT { ?dimension rdfs:label 'intensity' }
WHERE
  {
	?dimension rdfs:label 'intensty'.
	?dimension a qb:DimensionProperty .
  }

Given the Jena Model of the RDF graph, we can execute this query using the Jena ARQ API [[JENA-ARQ]] to fix the typo:

JAVA and C#:
UpdateAction.parseExecute(queryString, dataDescription);
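
The effect of the DELETE/INSERT operation can be pictured on a plain set of triples: the WHERE clause is evaluated first, and every subject that matches both patterns then loses the misspelled label and gains the corrected one, while all other triples stay untouched. The sketch below models this in plain Java. The names LabelFixSketch and fixLabel and the ex: identifiers are hypothetical; the code illustrates the semantics only and does not use the Jena or ADF API.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of the update semantics; a triple is a 3-element list.
final class LabelFixSketch {

	static final String LABEL = "rdfs:label";
	static final String TYPE = "rdf:type";
	static final String DIMENSION = "qb:DimensionProperty";

	// Replace oldLabel by newLabel on every subject typed as qb:DimensionProperty.
	static void fixLabel(Set<List<String>> graph, String oldLabel, String newLabel) {
		// WHERE: collect the matching subjects on the unmodified graph first.
		Set<String> matches = new HashSet<>();
		for (List<String> t : graph) {
			if (t.get(1).equals(LABEL) && t.get(2).equals(oldLabel)
					&& graph.contains(List.of(t.get(0), TYPE, DIMENSION))) {
				matches.add(t.get(0));
			}
		}
		// DELETE, then INSERT, for each match.
		for (String dimension : matches) {
			graph.remove(List.of(dimension, LABEL, oldLabel));
			graph.add(List.of(dimension, LABEL, newLabel));
		}
	}

	public static void main(String[] args) {
		Set<List<String>> graph = new HashSet<>();
		graph.add(List.of("ex:dim1", TYPE, DIMENSION));
		graph.add(List.of("ex:dim1", LABEL, "intensty"));
		graph.add(List.of("ex:note1", LABEL, "intensty")); // not a dimension, stays as is
		fixLabel(graph, "intensty", "intensity");
		System.out.println(graph.contains(List.of("ex:dim1", LABEL, "intensity")));
	}
}
```

The second pattern in the WHERE clause is what restricts the correction to dimensions; a resource that merely carries the same misspelled label is not changed.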

Removing a Statement from the RDF Graph

With SPARQL 1.1 Update, statements can be removed from an RDF graph with the DELETE operation (cf. the example in the previous section). The following query removes all archive information packages (AIPs) whose retention period of 50 years had expired by 2014-12-05.

PREFIX core: <http://purl.allotrope.org/core/metadata#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DELETE { ?aip  ?p ?v }
WHERE
  {
	?aip core:retentionTime ?date .
	FILTER ( ?date < "1964-12-05T00:00:00-02:00"^^xsd:dateTime )
	?aip ?p ?v .
  }

Given the Jena Model of the RDF graph, we can execute this query using the Jena ARQ API [[JENA-ARQ]]:

JAVA and C#:
UpdateAction.parseExecute(queryString, dataDescription);
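
The cutoff literal in the FILTER follows from the reference date minus the retention period: 2014-12-05 minus 50 years is 1964-12-05. The following is a minimal sketch of deriving that literal with java.time instead of hard-coding it. The class and method names are hypothetical, and the -02:00 offset is simply taken over from the query above.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of the ADF or Jena API.
final class RetentionCutoffSketch {

	// Returns the xsd:dateTime lexical form of (referenceDate - retentionYears).
	static String cutoffLiteral(OffsetDateTime referenceDate, int retentionYears) {
		OffsetDateTime cutoff = referenceDate.minusYears(retentionYears);
		// ISO_OFFSET_DATE_TIME always writes the seconds field.
		return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(cutoff);
	}

	public static void main(String[] args) {
		OffsetDateTime reference =
			OffsetDateTime.of(2014, 12, 5, 0, 0, 0, 0, ZoneOffset.ofHours(-2));
		// The value can be spliced into the FILTER as "..."^^xsd:dateTime.
		System.out.println(cutoffLiteral(reference, 50));
	}
}
```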

Version	Release Date	Remarks
0.4.0	2015-06-29	Initial Working Draft version
1.0.0 RC	2015-09-17	Renamed document from User Manual to Developer's Guide; Renamed section Example Code to Complete Example; Removed unnecessary sub section from section Complete Example; Added provenance information to the Complete Example
1.0.0	2015-09-29	Updated versions, dates and document status; Updated introduction; Removed code from section Complete Example
1.1.0 RC	2016-03-11	Updated versions, dates and document status; Added section on number formatting to document conventions; Added information and examples for C#/.NET
1.1.0 RF	2016-03-31	Updated versions, dates and document status
1.1.5	2016-05-13	Updated versions and dates
1.2.0 Preview	2016-09-23	Updated versions and dates
1.2.0 RC	2016-12-07	Updated versions and dates
1.3.0 Preview	2017-03-31	Updated versions and dates; Updated section 2.3.3 (Example 9)
1.3.0 RF	2017-06-30	Updated versions and dates
1.4.3 RC	2018-10-11	Updated versions and dates
1.4.5 RF	2018-12-17	Updated versions and dates
1.5.0 RC	2019-12-12	Updated versions and dates
1.5.0 RF	2020-03-03	Updated HDF5 reference link
1.5.3 RF	2020-11-30	Updated broken reference links; Updated PURL and DOCS server links to relative links; Reformatted the document header